Abstract



This study was conceived in response to criticisms of the current 
TOEFL listening comprehension test-item format. Major areas of 
criticism have included speculation that listening as tested places 
too much burden on short-term memory as opposed to comprehension, that 
a knowledge of reading is required in order to respond successfully, 
and that many items appear to require mere recall and matching of 
details rather than higher-order processing skills. To address these 
criticisms in turn, a study was designed with 120 ESL learners and 
three listening tests (comprising 144 real and adapted TOEFL
test items) to examine the characteristics of item functioning under 
conditions of stimulus repetition versus nonrepetition, variation of 
length of aural stimulus passage and of associated numbers of items, 
shorter versus longer reading response options, and higher versus 
lower level of processing skills required. Those item types and 
stimulus conditions that were found to associate with superior item 
functioning as indicated by estimates of item difficulty, item 
discriminability, internal consistency reliability, fit to a latent
trait model, and convergent and discriminant validity were identified. 

Results suggested that, while repetition of the stimulus passage 
predictably tended to reduce item difficulty when control was made for 
concomitant influences, there was no consistent effect of stimulus 
passage repetition on item discrimination, Rasch model fit, or 
discriminant validity across difficulty level. However, there was a 
tendency for items in the no-repetition condition to exhibit greater 
convergent and discriminant validity than items in the one-repetition
condition. 

Although passage length was confounded with numbers of items per 
passage and with comprehension hierarchy level, the test with passages 
of three-sentence length tended to be more reliable than the test with
passages of two-sentence length, and the test with passages of
two-sentence length tended to be more reliable than the test with passages
of one-sentence length. Also, the test with the longest passages
tended predictably to be slightly more difficult than the test with 
the shortest passages. 

Item response-option length was significantly related to item
difficulty and Rasch model fit: items with options shortened to about
half the current TOEFL response-option length tended to be easier and
to exhibit better fit than items with the current longer options.
Also, items with shortened options showed greater convergent and
discriminant validity across levels of difficulty than did items with
unshortened options. There was also a near-significant tendency for
items with shortened options to exhibit better discrimination than
items with unshortened options when concomitant influences were
controlled.



Comprehension hierarchy level of items, as defined by the length
of passage required to respond correctly, was not significantly
related to item difficulty except through a complex
option-length-by-hierarchy-level interaction. However, hierarchy level
was related to discrimination and Rasch model fit in the direction
that items with lower level of processing (i.e., those that required
comprehension of less stimulus text) showed better fit and
discrimination than higher-level items after concomitant influences
were removed. Also, greater convergent and discriminant validity
across difficulty levels was exhibited by lower-level comprehension
items than by higher-level items.

It was concluded that tasks like those employed in TOEFL 
Listening Comprehension Section A would benefit from a shortening of 
current response-option length, but that it was not beneficial to
repeat stimulus passages, nor was it desirable to increase the 
proportion of items that depended on comprehension of greater rather 
than lesser amounts of text. 
A. PROBLEM



The current TOEFL listening comprehension component has 
demonstrated highly satisfactory levels of internal-consistency 
reliability and criterion-related validity (Hale, Stansfield, & Duran, 
1984; Pike, 1979). Nevertheless, at least three criticisms of this 
component have been expressed by some TOEFL users, including former 
members of the TOEFL Committee of Examiners and the TOEFL Research 
Committee who were requested to offer such criticisms. First, it has 
been alleged that the format used places too much load on short-term
memory as opposed to comprehension. Secondly, it has been claimed 
that the use of a reading response format invalidates the test as a 
measure of listening comprehension only. Finally, it has been 
asserted that too many items require recall of minute details rather 
than higher-level processing strategies (Savignon, 1986; Stansfield, 
1986). Other criticisms related to communicative focus and language
authenticity have also been advanced (Bachman, 1986; Duran, Canale, 
Penfield, Stansfield, & Liskin-Gasparro, 1985). However, for the most 
part, these last concerns appear to have been discussed already by 
others (Larsen-Freeman, 1986; Oller, 1986) and are not included as
foci in the present research. 

With regard to the first criticism, related to memory versus 
comprehension becoming the focus of the test, it should be noted at 
the outset that some element of memory use would necessarily be 
present in any analysis of the listening comprehension construct. 
Unfortunately, as Carroll (1971), Devine (1978), and Larson, Backlund, 
Redmond, and Barbour (1978) have noted, there does not appear to be 
any scientific consensus about the exact nature of the listening 
comprehension construct or its components. Thus, there is no 
agreement on what portion of listening comprehension may be 
attributable to memory (whether short term or long term), or when that
portion has been exceeded with any proposed listening comprehension 
task. What does appear possible to ascertain through a study of the 
kind presented here is whether variation of memory taxation in a 
listening comprehension task differentially and systematically affects 
item quality. Item quality can be determined operationally in terms 
of appropriate difficulty for the population of interest, higher 
rather than lower discriminability, internal consistency of item
subgroupings, and greater convergent and discriminant validity of the
particular item format referenced. 

With regard to the criticism of the use of reading response
options for listening comprehension items, several replies are 
possible, but each calls for empirical evidence. Carroll (1971) 
conducted an extensive review of the literature of that time dealing
with the comprehension of meaningful verbal discourse. Evidence 
gathered there suggests that use of a combined listening and reading 
presentation mode may be advantageous at some levels of learner 
proficiency, but may interfere at other levels. It may also be 
asserted that, because language skills are known to be highly 
intercorrelated in general (Oller, 1979), and, because at least the
TOEFL listening passage stimulus and item stem or prompt are presented
aurally rather than in writing, use of a multiple-choice reading
response format for a listening comprehension task would not
appreciably contaminate the validity of the component as a measure of 
listening comprehension. However, if some reading contamination were 
found to be present for some items through inspection of excessive 
correlations with an independent reading measure, it would become 
useful to discover ways of minimizing such contamination. This would 
be particularly true if the correlations of the listening items with 
the reading criterion were higher than the correlations of those same 
items with their own listening subscale total. One obvious way to 
minimize such potential contamination effects would be to reduce the 
length of the reading response task. The present study considers item 
quality as described above under two different levels of item 
response-option reading length, i.e., current TOEFL response-option
length and an adaptation of current response-option length made by
shortening response options to about half their current length. 

With regard to the final criticism considered here, the one 
dealing with the cognitive processing hierarchy level addressed by the 
items, it has not always been easy for experts to reach consensus on
exactly what constitutes higher and lower order of processing for any 
given set of comprehension items. Alderson (1986) found that experts 
could reach consensus on only one-third of a set of ESL reading
comprehension items as to which items involved higher-order and which 
items involved lower-order cognitive processing. Even more disturbing 
for comprehension theorists has been his finding that, for those items 
for which consensus on classification was reached, the lower-order 
items systematically outperformed the higher-order items 
psychometrically. Due to this anticipated difficulty in achieving
consensus on classification, for the present study items were 
classified by processing hierarchy in accordance with the breadth of 
stimulus passage information needed to be processed before the correct
answer could be given. "Higher-order" items were those that required 
understanding of information across two or more sentences, while 
lower-order items could be answered correctly on the basis of 
understanding of information found in a word or phrase within a single
sentence of stimulus discourse. Once again, it was thought possible 
to determine item quality with reference to the criteria listed above, 
this time with respect to items at differing levels of processing 
hierarchy. 

In a related research study, Powers (1985) analyzed survey
responses of 144 university professors from 28 institutions to 
determine, among other things, which listening comprehension tasks 
were judged most appropriate for inclusion in a test of ESL listening
comprehension. Of 23 general and specific tasks examined, responding
to questions involving comprehension of numbers and numerical
concepts, providing inferences and deductions, answering with recalled
details, and condensing what is heard to outline form were the tasks 
most highly ranked by respondents. Although that study differed from 
the present one in that there was no investigation of item functioning 
in the former study, it is nevertheless interesting that there was no 
clear preference established for tasks involving higher-order over 
lower-order processing strategies.

B. PURPOSE 

The present study was conducted to examine the effects of varying 
memory load through use of repetitive and nonrepetitive aural
presentation procedures and through use of varying passage length 
formats. By varying repetition condition, passage length, and numbers 
of associated items, it was considered possible to investigate effects 
of these controlled variations on item difficulty, item 
discriminability, and format validity.

Additionally, the study was designed to consider the influences 
of varying length of reading task in the item response options. Two
levels of reading length were examined (current TOEFL listening 
response option length and a systematically shortened version of the 
currently employed format). Again, the effects of varying option
length were compared for measures of item difficulty, item 
discriminability, and format validity.

Of further interest was an investigation of the comparative
performances of listening comprehension items at three levels of the 
processing hierarchy, from memory for details within single sentences, 
to memory for information presented across two sentences, to 
comprehension of information encountered across three passage 
sentences. Previous research in the measurement of reading 
comprehension has called attention to the difficulty of reliable 
classification of hierarchies of cognitive processing (Alderson, 
1986). It was hoped that this strategy of classification according to 
extent of context upon which the item is based would help to overcome 
this classification difficulty in the case of listening comprehension 
assessment. Once again, comparisons of item difficulty, item 
discriminability, and format validity were made under each of the 
levels of processing hierarchy. 

Specifically, the principal variables of interest in this study
were:

(1) Repetition of stimulus passage. Two levels of repetition
were considered: no repetition and one repetition. 

(2) Passage length. Three levels of passage length were
considered: passages of one-, two-, and three-sentence length,
containing approximately 10, 20, and 30 words, with one, two, or three
associated test items, respectively.



(3) Reading response option length. Two levels of multiple-choice
response-option length were considered: current TOEFL listening
response-option length and a highly shortened version of the current
TOEFL response-option length. The current TOEFL response-option
length averaged 6.89 words with a standard deviation of 1.25 words for
the 72 unshortened items in the study. The shortened option length
averaged 3.34 words with a standard deviation of 0.76 words for the 72
shortened items of the study. Thus, on average, the shortened options
were slightly less than half the length of the unshortened options.
Additionally, it should be noted that stems of one to three words were
added to many of the shortened items to facilitate reduction of
overall length. The stems averaged 1.79 words in length with a
standard deviation of 0.91 words across the 72 shortened items.

(4) Processing hierarchy. Items were designed to measure three
levels of comprehension: comprehension of discrete details within
single sentences, comprehension of information presented across two
sentences, and comprehension of information presented across three
sentences. Thus, processing hierarchy was defined operationally in
terms of the comparative length of the stimulus passage required to be
processed in order to obtain the answer to the item.



C. METHOD



1. Sample

A sample of 120 subjects was identified from among the
English-as-a-second-language (ESL) students at three U.S. schools (Santa
Monica Community College, UCLA Extension, and New York University 
American Language Institute). Subjects varied widely in language 
proficiency, language background, time of residence in an English- 
speaking country, and time of formal English language study. (See 
Table 1 for a summary of subject characteristics.) All subjects 
volunteered to participate in consideration of the test-taking 
practice opportunity, the award of TOEFL practice materials, or 
nominal equivalent monetary compensation. 



2. Instrumentation

The following instruments were designed or adapted for the study:

(a) A brief, one-page demographic questionnaire requesting
information about native language background, length of residence in 
an English-speaking country, and length of English language study. 
(See the Appendix for a copy of this questionnaire.)



(b) A listening comprehension test with 48 one-sentence stimulus
passages and 48 related items (1 item per passage). These passages
were varied so that 24 were repeated once and 24 were not repeated.
Of the 24 repeated passages (24 items) and again of the 24 nonrepeated
passages (24 items), 12 associated items exhibited shortened reading
response-option format and 12 associated items exhibited current
unshortened reading response-option format. All items were copied or
adapted from prior, disclosed TOEFL forms. (See the Appendix for a
copy of this test, labeled "Listening Comprehension - 1.")

(c) A listening comprehension test with 24 two-sentence stimulus
passages and 48 related items (2 items per passage). These passages
were varied so that 12 were repeated once and 12 were not repeated.
Of the 12 repeated passages (24 items) and again of the 12
nonrepeated passages (24 items), 12 associated items exhibited
shortened reading response-option format and 12 associated items
exhibited current unshortened reading response-option format.
Distributed evenly and systematically throughout the test were items
representing two levels of processing hierarchy (24 items for level
one and 24 items for level two as described above, with one item of
each level for each passage). All items were copied or adapted from
prior, disclosed TOEFL forms. (See the Appendix for a copy of this
test, labeled "Listening Comprehension - 2.")

(d) A listening comprehension test with 16 three-sentence
stimulus passages and 48 related items (3 items per passage). These
passages were varied so that 8 were repeated once and 8 were not
repeated. Of the 8 repeated passages (24 items) and again of the 8
nonrepeated passages (24 items), 12 associated items exhibited
shortened reading response-option format and 12 associated items
exhibited current unshortened reading response-option format.
Distributed evenly and systematically throughout the test were items
representing three levels of processing hierarchy (16 items for each
of levels one, two, and three as described above, with 1 item of each
level for each passage). All items were copied or adapted from prior,
disclosed TOEFL forms. (See the Appendix for a copy of this test,
labeled "Listening Comprehension - 3.")

In construction of the three 48-item listening comprehension
tests described in b, c, and d above, use was made of the items in 
only part A of the listening comprehension components of the disclosed 
forms from the August 1985, July 1986, and November 1987
administrations of the TOEFL test. 
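
The composition of the three listening tests described in (b), (c),
and (d) can be tallied with a short Python sketch. Only the passage
and item counts come from the descriptions above; the dictionary
layout and the names TESTS and summarize are hypothetical, chosen for
illustration.

```python
# Passage counts are taken from instrument descriptions (b), (c), and (d).
# The data layout and all names here are illustrative assumptions.
TESTS = {
    "Listening Comprehension - 1": {"sentences": 1, "passages": 48},
    "Listening Comprehension - 2": {"sentences": 2, "passages": 24},
    "Listening Comprehension - 3": {"sentences": 3, "passages": 16},
}


def summarize(test):
    """Derive item counts for one test from its passage design."""
    # Each passage carries as many items as it has sentences (1, 2, or 3).
    items_per_passage = test["sentences"]
    n_items = test["passages"] * items_per_passage
    return {
        "items": n_items,
        "repeated_passages": test["passages"] // 2,
        "items_per_repetition_condition": n_items // 2,
        # Within each repetition condition, half the items are shortened.
        "items_per_option_length_within_condition": n_items // 4,
        # One item per hierarchy level per passage.
        "items_per_hierarchy_level": n_items // items_per_passage,
    }


summaries = {name: summarize(test) for name, test in TESTS.items()}
total_items = sum(s["items"] for s in summaries.values())
```

Under these counts every test has 48 items, 24 per repetition
condition and 12 per option-length format within each condition, for
144 items in all.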

(e) A disclosed TOEFL reading comprehension component test was 
administered to all subjects to provide a concomitant measure of 
reading ability. (See the Appendix for a copy of this test.) 

(f) A 15-item digital memory test was administered to all 
subjects to provide a concomitant measure of short-term memory. (See 
the Appendix for a copy of this test.)



3. Procedure

All 120 subjects responded to all tests and questionnaires. To
control for practice and sequence effects, each listening test was
administered in each ordinal position to approximately the same number
of subjects: the subjects were sequentially assigned to three
permutations of sequence order (approximately 40 subjects per
permutation). Thus,
for person group one, listening comprehension tests were administered 
in the sequence 1, 2, 3. For person group two, listening 
comprehension tests were administered in the sequence 2, 3, 1. For 
person group three, listening comprehension tests were administered in 
the sequence 3, 1, 2. To control for within-test sequence effects,
items were coded numerically by sequence and the resulting sequence 
variable was employed as a concomitant variable in the study after 
homogeneity of regression assumptions were shown to be satisfied. 
Also, to minimize practice and sequence effects, feedback on item
success or failure was not given at any time during test 
administration. Total testing time for all tests did not exceed two 
hours per subject. Balanced subsets of items and their associated 
stimulus sentences appeared in more than one test form. To control 
for any possible multiple-encounter effect, items were also coded
numerically in accordance with the number of encounters across tests. 
The resulting encounters variable was employed as a concomitant 
variable after homogeneity of regression assumptions were shown to be
satisfied. See Table 2 for a more thorough representation of the 
experimental design. 
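
The three administration sequences are the cyclic rotations of the
test order 1, 2, 3, so each test occupies each ordinal position once.
The following Python sketch reproduces that arrangement. It assumes
that "sequentially assigned" means rotating through the three orders
subject by subject and that the groups were exactly equal in size; the
text says only "approximately 40" per permutation. The function names
are hypothetical.

```python
def cyclic_orders(tests):
    """Return the cyclic rotations of a test list (a Latin square of orders)."""
    return [tests[i:] + tests[:i] for i in range(len(tests))]


def assign_sequentially(n_subjects, orders):
    """Assign subjects to orders in rotation (assumed reading of 'sequentially')."""
    return [orders[s % len(orders)] for s in range(n_subjects)]


orders = cyclic_orders([1, 2, 3])
assignments = assign_sequentially(120, orders)

# Count how often each test occupies each ordinal position across subjects.
position_counts = {
    (test, pos): sum(1 for order in assignments if order[pos] == test)
    for test in (1, 2, 3)
    for pos in range(3)
}
```

With 120 subjects and strict rotation, every test falls in every
position for the same number of subjects, which is the balance the
procedure aims at. Note that only three of the six possible
permutations are used, so position is balanced but the full set of
test-to-test carryover orders is not.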

4. Analyses 

Means, standard deviations, and internal consistency 
reliabilities were calculated for each test and subtest variation. 
Sample demographic information was also tallied. 

Rasch model item difficulty and fit estimates and both biserial
and point-biserial item-total score discriminability indices were
computed for every item under every response condition. Biserial and 
point-biserial correlations of every listening comprehension item were 
computed with the digital memory and TOEFL reading test scores. Mean 
and standard deviation item difficulty, discriminability, Rasch model 
fit, item-TOEFL reading correlation, and item-digital memory 
correlation were computed for all listening comprehension items. Use 
of Rasch model item difficulty estimates was preferred over 
traditional proportion correct (p) values because the former estimates 
provided a small-sample logarithmic transformation to an
equal-interval scale (Wright & Stone, 1979). Biserial item-total, biserial
item-TOEFL reading, and biserial item-digital memory correlations were 
preferred over their point-biserial counterparts because of the 
assumptions of normality of distribution that were believed tenable 
for these data. Correlational computations employed correction for
part-whole overlap and Fisher Z transformation as needed. The Rasch
model fit statistic "infit" was employed as an item construct validity
criterion (Wright & Linacre, 1984). Essentially, this fit statistic 
reports the degree of improbability of the pattern of responses to any 
item, given the pattern of responses of the same persons to all other 
items. 
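
The relation between the two discrimination indices, and the
correction for part-whole overlap, can be sketched in Python as
follows. The formulas are the standard textbook ones. The report does
not say how the overlap correction was computed; removing the item
from its own total before correlating is one common approach and is
assumed here. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def point_biserial(item, criterion):
    """Pearson correlation between a 0/1 item and a continuous criterion."""
    p = mean(item)
    q = 1 - p
    m1 = mean(c for x, c in zip(item, criterion) if x == 1)
    m0 = mean(c for x, c in zip(item, criterion) if x == 0)
    return (m1 - m0) / pstdev(criterion) * sqrt(p * q)


def biserial(item, criterion):
    """Biserial correlation; assumes a normal latent variable behind the item."""
    p = mean(item)
    q = 1 - p
    std_normal = NormalDist()
    # Height of the standard normal curve at the point dividing p from q.
    y = std_normal.pdf(std_normal.inv_cdf(p))
    return point_biserial(item, criterion) * sqrt(p * q) / y


def corrected_item_total(responses, j):
    """Biserial of item j with the total of the other items (assumed correction)."""
    item = [row[j] for row in responses]
    rest = [sum(row) - row[j] for row in responses]
    return biserial(item, rest)
```

Each function requires that the item have both correct and incorrect
responses and that the criterion vary. The biserial is always larger
in magnitude than the point-biserial computed from the same data, and
it can exceed 1 when the normality assumption is badly violated.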

Factorial analyses of variance were calculated sequentially using 
Rasch item difficulty and fit estimates and item discrimination 
indices as dependent variables. Following appropriate tests of the 
assumption of homogeneity of regression coefficients, analyses of 
covariance were conducted using item sequence and item encounters as 
concomitant variables to test main effects and interaction effects
with potentially contaminating influences removed. 

Construct validity of the various item formats was assessed in 
two different ways. First, since Rasch model fit estimates provide an 
indication of the fit or response validity of the items to the 
expectations of the model, the aforementioned ANOVA and ANCOVA
procedures using fit as a dependent variable served to indicate the 
comparative validity of items under the various response conditions. 
Secondly, use was made of a procedure analogous to the
multitrait-multimethod validation procedure (Campbell & Fiske, 1959),
with formats serving as traits and high-difficulty/low-difficulty item
splits within formats serving to define methods. By this procedure
each item format variation was examined for convergent and 
discriminant validity across levels of item difficulty. The 
comparative validities of format variations were ascertainable as 
comparative magnitudes of matrix diagonal coefficients. 



D. RESULTS 



1. Test and Subtest Descriptive Statistics

Descriptive statistics for all tests and subtests are provided in 
Table 3. Note that estimates of internal consistency reliability 
(alpha) are provided for every test and item-subtest combination.
Note also that, since reliability is a partial function of the number 
of items in a test, the final column of the table provides Spearman- 
Brown adjusted estimates to hold the number of items constant at 50 
for all tests and subtests. 
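
The Spearman-Brown adjustment to a common length of 50 items can be
sketched in Python as below. The formula is the standard prophecy
formula; the function name and the example values in the usage note
are hypothetical and are not taken from Table 3.

```python
def spearman_brown(reliability, n_items, target_items=50):
    """Project a reliability estimate to a test of target_items items."""
    k = target_items / n_items          # lengthening (or shortening) factor
    return k * reliability / (1 + (k - 1) * reliability)
```

For example, a hypothetical 12-item subtest with alpha of 0.60 would
project to about 0.86 at 50 items, whereas a 48-item test changes very
little, which is why the adjustment matters mainly when short subtests
are compared with full tests.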

These gross statistics across subtests reveal few significant 
differences that may be attached to particular item formats. The 
reported total test means reveal a predictable but slight tendency for 
tests with the longest passages (e.g., Listening Comprehension 3) to
be most difficult and tests with the shortest passages (e.g., 
Listening Comprehension 1) to be least difficult. Repetition of the 
stimulus passages showed no consistent difference from nonrepetition,
whether in terms of test difficulty or of test reliability. With the 
exception of the first test (Listening Comprehension 1), there was a 
tendency for subtests with short reading response option items to be 
both easier and more reliable than subtests with longer reading 
response options. Across the levels of cognitive processing
hierarchy as defined, subtests with lower-order items showed a
distinct tendency to be both easier and more reliable than subtests
with higher-order items. The shortest-passage test, Listening
Comprehension 1, was less reliable than the second-shortest-passage
test, Listening Comprehension 2 (.835 versus .871), and the
second-shortest-passage test, Listening Comprehension 2, was less
reliable than the longest-passage test, Listening Comprehension 3
(.871 versus .890). However, it should be noted that passage length
was confounded with numbers of items per passage and with
comprehension hierarchy levels of associated items.



2. Descriptive Statistics for Independent and Dependent 
Variables 

Table 4 reports means, standard deviations, standard errors, and
ranges for the item variables employed in the study. Note also that
every item was classified according to level of repetition (1 = one
repetition, 2 = no repetition), level of response option length
(1 = shortened, 2 = current length), and level of cognitive processing
hierarchy (1 = comprehension of information from a word or phrase
within one sentence, 2 = comprehension of information across two
sentences, and 3 = comprehension of information across three sentences
of the stimulus passage). Table 4 reports results for two different
difficulty
statistics, six different discrimination statistics, and three item 
validity indicators. For reasons already given, some of these 
statistics were more appropriate than others for use in the subsequent 
analyses. Only those statistics deemed appropriate were subsequently 
employed in analyses. 

The statistics reported for TOEFL reading and digital memory 
consist of the means, standard deviations, and ranges of biserial 
correlations computed between individual item scores and reading and 
recall test scores. Similarly, the statistics reported for 
discrimination consist of means, standard deviations, and ranges of 
both biserial and point-biserial correlations between individual item
scores and listening comprehension test total scores. In all analyses 
involving computation with correlation coefficients, use was made of 
Fisher Z transformations to correct for scaling inadequacies of 
correlation coefficients. 
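
The role of the Fisher Z transformation in summarizing correlations
can be sketched in Python as follows. The transformation itself is
standard; the function name is hypothetical, and the report does not
state which computations were carried out on the Z scale beyond the
description above.

```python
from math import atanh, tanh


def mean_correlation(rs):
    """Average correlations on the Fisher Z scale, then back-transform."""
    zs = [atanh(r) for r in rs]         # z = 0.5 * ln((1 + r) / (1 - r))
    return tanh(sum(zs) / len(zs))
```

Averaging on the Z scale gives more weight to large coefficients than
a plain arithmetic mean does: 0.10 and 0.90 average to about 0.66
rather than 0.50. The transformation is undefined for coefficients at
or beyond 1 in magnitude, a limit that biserial correlations can
occasionally reach.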

The Rasch model difficulty and fit data were estimated with
Microscale Version 1.20 (Wright & Linacre, 1984). As a feature of
that program, the mean of item difficulty statistics is arbitrarily
set at zero. The particular fit statistic chosen was the Rasch model
"infit" estimate also provided by that program. This is a sensitive
index of the degree of departure of individual item responses from
model expectation. High positive fit statistics are usually
interpreted as reflective of model misfit, while high negative fit
statistics are said to represent overfit to the model (Wright & Stone,
1979). 
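
A minimal Python sketch of the quantities involved is given below. It
uses the standard dichotomous Rasch model and the usual
information-weighted mean-square form of infit; it is not a
reproduction of the Microscale program, whose estimation routine is
not described here. The positive and negative fit values mentioned
above refer to a standardized form of this mean square, which the
sketch does not compute. All function names are hypothetical.

```python
from math import exp


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model (logits)."""
    return 1.0 / (1.0 + exp(-(ability - difficulty)))


def infit_mean_square(responses, abilities, difficulty):
    """Information-weighted mean-square fit for one item."""
    ps = [rasch_probability(b, difficulty) for b in abilities]
    squared_residuals = sum((x - p) ** 2 for x, p in zip(responses, ps))
    information = sum(p * (1 - p) for p in ps)
    return squared_residuals / information


def center(difficulties):
    """Shift item difficulties so that their mean is zero."""
    m = sum(difficulties) / len(difficulties)
    return [d - m for d in difficulties]
```

A mean square near 1 indicates responses about as variable as the
model expects; larger values correspond to misfit and smaller values
to overfit.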

Intercorrelations among a pertinent subset of these variables are
reported in Table 5. The coefficients reported in Table 5 indicate
that there was a weak positive relationship (0.174) between item
difficulty (DIF) and the extent to which the item required processing
of longer versus shorter segments of the listening stimulus passage
(HIR). Item correlation with the digital memory (MEM) test showed a
comparatively strong positive relationship to item correlation with
the TOEFL reading test (RDG) (0.397), to item-total biserial
discriminability (ITB) (0.381), and to Rasch model item infit (FIT)
(-0.361). (Note that a negative correlation with the fit statistic
reflects a positive relationship to model fit.) A similar pattern of
correlations was observed for item correlations with reading test
scores (RDG) as was observed for item correlations with recall test
scores (RCL) discussed in this paragraph. Separate analyses indicated
that only 15 of 144 items showed higher correlation with TOEFL reading
than with their respective corrected domain totals, suggesting that 90
percent of all listening comprehension items could not be said to be
contaminated by reading effects. Among those 15 deviant items, no clear
frequency pattern was present for items of any one experimental
condition over any other experimental condition. Similarly, only 11
of 144 items showed higher correlation with digital memory than with
their respective corrected domain totals, suggesting that 92 percent
of all listening comprehension items could not be said to be
contaminated by memory effects. Interestingly, 10 of those 11 deviant
items were of the lowest comprehension hierarchy level, implying that
recall of discrete information within a single sentence was more
taxing on memory than was recall of information across two or three
sentences. No other patterns emerged for these items. Discrimination
was related to difficulty (-0.185), Rasch fit (-0.763), TOEFL reading
(0.433), and digital memory (0.381). In general, the nonsignificant
correlations reported among repetition, option length, and hierarchy
level with many of the other relevant item variables contrasted
sharply with the results of the ANOVA, ANCOVA, and
multitrait-multimethod type analyses that follow. These differences may be
attributed to the effects of removal of interaction effects in the
partitioning of variance, to the effects of removal of contributions
of concomitant variables (in the case of ANOVA and ANCOVA), or to the
effects of grouping items more directly within response conditions (in
the case of the multitrait-multimethod type analysis).
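
The contamination check described in this paragraph amounts to
comparing, item by item, the correlation with an external measure
against the correlation with the item's own corrected domain total. A
minimal Python sketch follows; the function names and the input lists
are hypothetical, and only the counts 15, 11, and 144 come from the
text.

```python
def flag_contaminated(item_external_r, item_domain_r):
    """Indices of items correlating more with the external test than
    with their own corrected domain total."""
    pairs = zip(item_external_r, item_domain_r)
    return [i for i, (ext, dom) in enumerate(pairs) if ext > dom]


def percent_unflagged(n_flagged, n_items=144):
    """Share of items not flagged, as a rounded percentage."""
    return round(100 * (n_items - n_flagged) / n_items)


reading_clean = percent_unflagged(15)   # 15 items flagged against reading
memory_clean = percent_unflagged(11)    # 11 items flagged against memory
```

The two percentages recover the rounded figures of 90 and 92 percent
reported above.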



3. Factorial Analyses of Variance and Covariance
(a) Effects on Difficulty 

The effects of levels of repetition, option response reading
length, and processing hierarchy on Rasch model item difficulty
estimates are reported in Tables 6A, 6B, and 6C. The analysis of
variance reported in Table 6A indicates a significant effect of option
length on item difficulty (p = 0.018); however, this generalization
must be qualified by the finding of a significant interaction effect
between length and hierarchy (p = 0.015). There was also a
near-significant tendency toward an effect of repetition on item
difficulty (p = 0.066). This tendency became even more salient in the
analysis of covariance reported in Table 6B after concomitant
influences of item sequence and item encounters were controlled. The
means reported in Table 6C indicate the direction of the important
effects noted in Table 6A. Changing from one repetition of the
stimulus passage to no repetitions tended to increase item difficulty.
Changing from shortened option length to current longer option length
tended to increase item difficulty. Lower-order within-sentence
processing items tended to be easier than higher-order across-sentence
processing items, but the highest-order three-sentence items were not
more difficult than the second-order two-sentence items. The
significant length-by-hierarchy interaction effect was of a sort that,
while difficulty did tend to increase with increase in response option
length overall, at the lowest level of the processing hierarchy, items
with shortened option length appeared more difficult than items with
current unshortened length. However, at the second and third levels
of the processing hierarchy, shortened option length was more strongly
associated with lower item difficulty than was current longer option
length.

Table 6B reports the results of analysis of covariance using the
item sequence and item encounters variables as the two concomitant
variables. These variables satisfied the ANCOVA assumption of
homogeneity of regression slopes. As Table 6B indicates, while use of
these concomitant variables in the analysis increased the power of
testing, it did not alter the pattern of significance of the effects
or the interpretation of outcomes. Multiple correlation coefficients
accompanying each ANOVA and ANCOVA table provide some indication of
the overall size of effects.
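
As an illustration of how the tabled quantities relate to one another,
the following minimal sketch recomputes F ratios from three of the mean
squares printed in Table 6A and squares the multiple correlation
reported there. It assumes only the standard fixed-effects relation
F = MS(effect) / MS(error); the variable and function names are
hypothetical and are not taken from the study.

```python
# Mean squares copied from Table 6A (Rasch difficulty as dependent variable).
MS_ERROR = 1.083  # error mean square, 132 degrees of freedom

mean_squares = {
    "Repetition": 3.719,
    "Option Length": 6.265,
    "LENxHIR": 4.657,
}


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F statistic for one effect: effect MS over error MS."""
    return ms_effect / ms_error


f_values = {name: f_ratio(ms, MS_ERROR) for name, ms in mean_squares.items()}

# Squaring the multiple correlation R gives the proportion of variance
# in item difficulty accounted for by the full model.
r_squared = 0.391 ** 2

for name, f in f_values.items():
    print(f"{name}: F = {f:.3f}")
print(f"R squared = {r_squared:.3f}")
```

The recomputed ratios agree with the tabled F values to within
rounding of the printed mean squares.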

(b) Effects on Discrimination 

Tables 7A and 7B report the effects of stimulus repetition,
option length, and processing hierarchy on item discriminability as
computed by item-total biserial correlation. The slight tendency for
option length to affect item discriminability, noted in the ANOVA of
Table 7A (p = 0.129), became more salient in the ANCOVA of Table 7B
(p = 0.075), where the contributions of item sequence and item
encounters are controlled in the same manner as was reported in
Table 6B. Again, the ANCOVA assumption of homogeneity of regression
coefficients was satisfied for the present analysis. Unlike the case
with effects on difficulty reported earlier, there were no significant
interaction effects in Tables 7A or 7B. The direction of the tendency
of option length to affect discriminability was such that shorter
option length was associated with greater discriminability than was
current longer option length. The ANCOVA of Table 7B reports a
significant effect of hierarchy level on discrimination (p = 0.036).
This effect was in the direction that items of lower levels of the
comprehension hierarchy tended to show greater discrimination than did
items of higher levels. These same results were replicated for these
data when point-biserial correlations were used instead of biserial
item-total score correlations to reflect item discriminability. It is
important to note again that in all of these analyses Fisher Z
transformations were used to enable more accurate computation with
correlation coefficients.
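
For readers unfamiliar with the Fisher Z transformation, a minimal
sketch follows. It shows the transformation, its inverse, averaging of
correlations on the z scale, and the usual large-sample standard error
of z for a Pearson correlation. The function names are hypothetical,
and the sketch is not a reconstruction of the computations actually
performed in the study.

```python
import math


def fisher_z(r: float) -> float:
    """Transform a correlation r (|r| < 1) to Fisher's z."""
    return math.atanh(r)


def inverse_fisher_z(z: float) -> float:
    """Transform Fisher's z back to a correlation."""
    return math.tanh(z)


def mean_correlation(correlations):
    """Average correlations on the z scale, then transform back to r."""
    zs = [fisher_z(r) for r in correlations]
    return inverse_fisher_z(sum(zs) / len(zs))


def fisher_z_standard_error(n: int) -> float:
    """Large-sample standard error of z for a Pearson r from n cases."""
    return 1.0 / math.sqrt(n - 3)


# Averaging on the z scale weights larger correlations more heavily
# than a simple arithmetic mean of the r values would.
print(mean_correlation([0.2, 0.8]))
print(fisher_z_standard_error(120))
```

With 120 cases, as in the present sample, the standard error of z
under this formula is about 0.092.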

(c) Effects on Model Fit 

A final item quality criterion used in the study of impact of
stimulus repetition, response-option length, and processing hierarchy
was the Rasch model infit statistic generated by the software program
Microscale 1.20. Since this statistic is sensitive to violations of
unidimensionality constraints as would occur if respondents
differentially guessed answers to some items or if some items tended
to measure unintended constructs, the following analyses tend to
reflect the comparative construct or response validity of the items
under various response conditions. Tables 8A and 8B report the
effects of stimulus repetition, option length, and processing
hierarchy on Rasch model infit. Table 8A provides ANOVA information
indicating significant option length (p = 0.040) and processing
hierarchy (p = 0.045) effects on model fit. The ANCOVA of Table 8B
indicates that no significant differences in the pattern of effects
were observed after influence of the concomitant variables was
controlled. The direction of these effects was such that items with
shorter response options tended to provide better fit to the
predictions of the model than did items with current longer response
options, and items at levels one and two of the processing hierarchy
tended to provide better fit to model expectations than did items at
level three. Overall, items of lower-order processing hierarchy
showed better construct validity than items of higher-order processing
hierarchy as defined here and as judged in terms of impact on Rasch
model infit.
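
The logic of an infit statistic can be sketched as follows. The
example computes the Rasch model probability of a correct response and
the unstandardized, information-weighted mean-square fit for a single
item. This is an illustration under stated assumptions: the values
analyzed in Tables 8A and 8B appear to be on a standardized scale, the
exact computation used by the software is not described here, and the
abilities and response patterns below are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def infit_mean_square(responses, abilities, difficulty):
    """Information-weighted mean-square fit for one item.

    Sum of squared residuals divided by the sum of the model
    variances p * (1 - p) across persons.
    """
    squared_residuals = 0.0
    information = 0.0
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        squared_residuals += (x - p) ** 2
        information += p * (1.0 - p)
    return squared_residuals / information


# Hypothetical abilities (logits) for five persons, item difficulty 0.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]
orderly = infit_mean_square([0, 0, 1, 1, 1], abilities, 0.0)
erratic = infit_mean_square([1, 1, 0, 0, 0], abilities, 0.0)
print(orderly, erratic)
```

Values near 1 indicate responses about as variable as the model
expects. The orderly pattern falls below 1, while the reversed pattern,
in which less able persons succeed and more able persons fail, rises
well above 1, as would happen with guessing or with an item measuring
an unintended construct.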



4. Multitrait-Multimethod Validation 

The Campbell and Fiske (1959) multitrait-multimethod validation 
procedure provides a set of criteria for establishing construct 
validity of proposed traits through inspection of an appropriate 
trait-by-method correlation matrix. An adaptation of this procedure 
was made for the present analysis in order to determine whether 
patterns of item responses under the various repetition, option 
length, and comprehension hierarchy levels would be stable across 
levels of item difficulty. This is a clear indication of the 
construct validity of tests comprised of items of the various format 
types (e.g., with passage repetition or no repetition, with shorter or 
longer response options, and with lower or higher levels of the 
comprehension hierarchy). By this procedure, item conditions, here
analogous to traits, would be judged to exhibit monotrait-heteromethod 
convergent validity if the correlations of test scores for the same 
traits across different levels of difficulty, here analogous to 
methods, were significantly greater than zero. As an additional step 
in the procedure, if the convergent validity coefficients considered 
in step one were also found to exceed in magnitude all adjacent 
heterotrait-monomethod coefficients, the traits associated with the 
convergent validity coefficients could be said to exhibit heterotrait- 
monomethod discriminant validity. And finally, if the convergent 
validity coefficients were found to exceed all adjacent heterotrait- 
heteromethod coefficients, the traits associated with the convergent 
validity coefficients could be said to exhibit heterotrait- 
heteromethod discriminant validity. 
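
The comparisons just described can be sketched in code. The example
below uses two hypothetical traits (A and B), two methods (hard and
easy item groupings), and invented correlations; none of these values
come from Table 9. It checks only the magnitude comparisons and omits
the significance test of the convergent coefficient against zero.

```python
TRAITS = ["A", "B"]
METHODS = ["hard", "easy"]

# Hypothetical correlations between the four trait-by-method tests.
_raw = {
    (("A", "hard"), ("A", "easy")): 0.70,  # convergent, trait A
    (("B", "hard"), ("B", "easy")): 0.45,  # convergent, trait B
    (("A", "hard"), ("B", "hard")): 0.50,  # heterotrait-monomethod
    (("A", "easy"), ("B", "easy")): 0.55,  # heterotrait-monomethod
    (("A", "hard"), ("B", "easy")): 0.40,  # heterotrait-heteromethod
    (("A", "easy"), ("B", "hard")): 0.42,  # heterotrait-heteromethod
}
CORR = {frozenset(pair): r for pair, r in _raw.items()}


def corr(test1, test2):
    """Look up the correlation between two (trait, method) tests."""
    return CORR[frozenset((test1, test2))]


def convergent(trait):
    """Monotrait-heteromethod coefficient for a trait."""
    return corr((trait, METHODS[0]), (trait, METHODS[1]))


def heterotrait_monomethod(trait):
    return [corr((trait, m), (other, m))
            for other in TRAITS if other != trait
            for m in METHODS]


def heterotrait_heteromethod(trait):
    return [corr((trait, m1), (other, m2))
            for other in TRAITS if other != trait
            for m1 in METHODS for m2 in METHODS if m1 != m2]


def validity_summary(trait):
    """Apply the two discriminant comparisons to one trait."""
    c = convergent(trait)
    return {
        "convergent": c,
        "discriminant_monomethod":
            all(c > r for r in heterotrait_monomethod(trait)),
        "discriminant_heteromethod":
            all(c > r for r in heterotrait_heteromethod(trait)),
    }


for trait in TRAITS:
    print(trait, validity_summary(trait))
```

In this invented example trait A passes both discriminant comparisons,
whereas trait B fails the heterotrait-monomethod comparison because its
convergent coefficient (0.45) does not exceed the same-method
correlations with trait A.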

Table 9 reports the multitrait-multimethod validation matrix that
was derived from intercorrelations of scores from twelve 12-item tests
assembled purposefully from the items of the experimental tests in the
present study. These twelve 12-item tests were formed by grouping
separately high-difficulty items and low-difficulty items within two
levels of stimulus repetition, option length, and processing
hierarchy. For purposes of the analysis, high and low difficulty
item groupings were considered analogous to methods in each trait
comparison. Thus, construct validity in this study would reflect
stability across the difficulty continuum of the item characteristic
that is being considered. Note that the underscored coefficients in
the diagonal of the matrix comprise the convergent validity
coefficients, and all of these coefficients significantly exceed zero,
so all traits show convergent validity by this lenient criterion.
However, only the tests with items of shortened option length (LEN1,
r = 0.7?5) and the tests with items of lowest-order processing
hierarchy (HIR1, r = 0.723) exhibited discriminant validity in all
required comparisons. While tests prepared from items of neither
stimulus repetition condition were completely successful in terms of
every discriminant validity comparison, the no-repetition condition
(REP2, r = 0.547) showed greater convergent and discriminant validity
than the one-repetition condition (REP1, r = 0.437).

Results of this analysis support the use of nonrepeated listening 
stimuli over repeated listening stimuli. Nonrepetition of listening 
stimuli is the current procedure with TOEFL listening comprehension 
testing. Furthermore, the analysis provides further support for use 
of item format that is shortened in option response length from the 
currently used option length. And finally, the analysis does not 
provide evidence in support of item format that requires higher- 
rather than lower-order cognitive processing as determined by the
comparative length of the stimulus passage that must be processed to
respond to the item.



E. DISCUSSION AND CONCLUSIONS



This study was proposed to compare TOEFL listening comprehension 
item quality under a variety of conditions of stimulus repetition, 
response-option reading length, and cognitive processing hierarchy. 
Results support several conclusions relevant to TOEFL listening 
comprehension test item development. 

1. Repetition and Length of Stimulus Passage and Memory Effects
One concern expressed by critics of the current TOEFL listening
format is that it places too much burden on short-term memory as
opposed to tapping comprehension. The present study attempted to
investigate this concern in several ways. First of all, item
performance was examined under two repetition-of-stimulus conditions
(i.e., one repetition and no repetition). The rationale for this
procedure was the belief that repetition of the stimulus passage
would lessen the burden on memory and permit a test of the
effects of such a reduced burden on the performance of the associated
items. While repetition also increases the opportunity to comprehend,
it was thought that repetition would also reinforce memory for
information that was comprehended on the first exposure. Results
suggested that, while there was a predictable trend for items in the
stimulus-repetition condition to be easier than items in the
nonrepetition-of-stimulus condition (Table 6B, p = 0.052), there was
no evidence that repetition of stimulus had any positive effect on
item discrimination (Tables 7A and 7B), item response validity as
indicated by fit to a latent-trait model (Tables 8A and 8B), or format
construct validity as indicated by a procedure analogous to the
Campbell and Fiske (1959) multitrait-multimethod validation procedure
(Table 9).

To investigate the effects of length of stimulus passage on the 
quality of item performance, stimulus passages were constructed with 
lengths varying from one to three sentences. Here it was thought that 
length of stimulus passage could also provide a measure of burden on 
memory. Again, while there was a predictable tendency for tests 
composed of items associated with one-sentence stimulus passages to be 
easier than tests composed of items associated with two-sentence 
stimulus passages and for tests composed of items associated with two- 
sentence stimulus passages to be easier than tests composed of items 
associated with three-sentence stimulus passages (Table 3), test
reliability tended to increase with increase in length of the stimulus
passage (Table 3). Internal consistency reliability estimates for
tests of 50-item length varied according to length of stimulus passage
as follows: one-sentence passages, 0.841; two-sentence passages,
0.875; and three-sentence passages, 0.894. Since estimates of
internal consistency can be shown to be positively related to item 
discriminability and increased potential for empirical validity, there 
is no evidence in the present results to suggest that any additional 
burden on memory associated with either stimulus passage length or 
nonrepetition of stimulus passage will negatively affect item quality 
or task validity. 

A final consideration relevant to the issue of memory load in the
assessment of listening comprehension involved the use of an
independent measure of short-term memory as one of the tests in the
study. Examinee scores on the memory test were correlated with the
120-case binary response vectors for each listening comprehension
item. The resulting correlation coefficients provided one indication
of the extent to which success with any given listening item was
related to short-term memory. For the 144 listening comprehension
items in this study, these memory-dependedness correlations were in
turn correlated with such item characteristics as whether the item was
associated with a repeated or a non-repeated stimulus passage, whether
the written response options for items were of the shortened or
unshortened variety, and whether the level of comprehension hierarchy
was one, two, or three, as defined, for any given item (Table 5).
Results suggested that, while memory dependedness was an important
item characteristic as indicated by its significant correlations with
estimates of item discriminability and model fit, there was no
significant relationship (whether attenuated or disattenuated) between
memory dependedness of item success and level of repetition, option
length, or comprehension hierarchy. Only 11 of the 144 items were
found for which the correlation with the digital memory test score
exceeded the corrected correlation with subtest total score. Thus, 92
percent of the listening comprehension items showed greater relation
to a measure of comprehension than to a measure of memory.
Interestingly, 10 of the 11 deviant items were lowest-comprehension-
level items, suggesting that correctly responding to items requiring
comprehension of information within a single sentence was more taxing
on memory than was correctly responding to items requiring
comprehension of information across two or three sentences.
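
The flagging rule described above can be sketched as follows. The
example is a simplification under stated assumptions: it uses a
point-biserial (Pearson) correlation rather than the biserial
coefficients listed in Table 4, the helper names are hypothetical, and
the six-person data set is invented for illustration. Only the counts
11 and 144 are taken from the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 variable this is point-biserial."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_memory_dependent(item, memory_scores, rest_scores):
    """Flag an item whose correlation with the memory score exceeds
    its correlation with the overlap-corrected subtest score."""
    return pearson_r(item, memory_scores) > pearson_r(item, rest_scores)


# Invented data for six examinees: item responses (0/1), memory test
# scores, and subtest totals with the item itself removed.
item = [0, 0, 1, 1, 1, 0]
memory = [3, 4, 8, 9, 7, 5]
rest = [10, 12, 11, 9, 13, 12]
print(is_memory_dependent(item, memory, rest))

# Share of items more closely related to comprehension than to memory,
# using the counts reported in the text.
share_comprehension = (144 - 11) / 144
print(f"{100 * share_comprehension:.1f} percent")
```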

These results, taken separately and in combination, provide no 
support for the hypothesis that the current item formats in the TOEFL 
listening comprehension component overly tax short-term memory to the 
detriment of appropriate assessment of listening comprehension. While 
reduction of memory load of listening comprehension items would tend 
to result in easier items and higher test scores, such reduction of 
memory load would likely also be associated with reduction in both 
item discriminability and fit to a latent-trait model, and the 
reliability and validity of resulting tests would thereby be 
decreased. 

2. Response Option Length and Reading Effects 
Another concern of some critics of the present TOEFL listening 
item formats is related to the reliance on written-response options to
assess the ability to comprehend spoken discourse. These critics
would maintain that use of a reading task in the assessment of
listening comprehension serves to confound the assessed construct of
listening comprehension with that of reading comprehension. Thus, it 
is alleged that the listening component is not so valid as it would be 
if the response options were presented aurally rather than in writing. 

It should be acknowledged that there may be program-operational
constraints associated with time available for testing that dictate
some such format decisions within large-scale testing programs such as
the TOEFL program. Furthermore, since, for TOEFL, primary use is made
of examination total scores rather than of component scores for
decision-making purposes, use of more nearly integrative tasks within
components would in no way compromise the validity of TOEFL total 
scores for intended uses. Nevertheless, it is an appropriate research 
question to determine whether the nature of the response task in the 
listening comprehension component of TOEFL could be altered in any way 
to improve the validity of that component. 

To investigate the concern raised here, use was made of two
levels of reading response-option length. Seventy-two items were
employed that used the current TOEFL response-option length (an
average of 6.89 words per option), and 72 items were employed using an
edited and highly shortened response-option length (an average of 3.34
words per option). The rationale for this procedure was that
reduction of the usual reading task by about one half would enable a
partial test of the value of such minimization of reading within the
listening comprehension component. All of the items used were either
actual, disclosed TOEFL listening items or were adapted from such
items.

The results of several analyses (all but the correlational 
analysis of Table 5) suggested predictably that the items with 
shortened option length were easier than items with unshortened option 
length (Tables 3, 6A, 6B, and 6C), although the ANOVA and ANCOVA
results were qualified somewhat by the finding of a significant
interaction between length and processing hierarchy such that there
was a tendency for items with shortened options to diminish in
difficulty at the highest level of processing hierarchy (i.e., at the
point where the task required synthesis over the greatest amount of
passage content). Items with current, unshortened option length
conversely tended to increase in difficulty with the increase in level
of processing (Tables 6A, 6B, and 6C).

There was a nonsignificant tendency (p = 0.075) for items with 
shortened options to demonstrate greater discriminability than items 
with unshortened options when sequence and encounter effects were 
controlled through analysis of covariance (Table 7B). There was a
significant tendency (p = 0.040, 0.033) for items with shortened
option length to demonstrate greater response validity than items with
unshortened option length as indicated by effects on fit to a
latent-trait model (Tables 8A and 8B). Also, results of an analysis
analogous to multitrait-multimethod analysis (Table 9) indicated that
items with shortened option length, unlike items with unshortened
option length, demonstrated discriminant validity in all required
comparisons.

Although the present study was not designed to address fully the 
question of use of aural response options versus current written
response options, results of several of the present analyses do
suggest that a reduction of current written response-option length by
about one-half does lead to improved item discriminability and greater
format validity. It is useful to observe that shortening the response
options in the TOEFL listening component was one of the earlier
recommendations offered independently by Savignon (1986).

3. Level of Processing and Comprehension Hierarchy Effects

A final concern of interest here regards the criticism sometimes 
made of the listening comprehension component of the TOEFL test that 
too much reliance is placed on item types that tap comprehension at 
the lowest level (i.e., bottom-up comprehension or memory for discrete 
details in the stimulus passage) as opposed to higher levels of 
comprehension involving top-down strategies, such as inferencing and 
synthesizing processes. Related research in the area of reading 
comprehension has reported difficulty in obtaining expert agreement on 
what levels in the comprehension hierarchy are addressed by particular 
comprehension items (Alderson, 1986). To avoid this classification 
problem, in the present study distinctions among levels of processing 
were made on the basis of the amount of stimulus passage required to 
be processed in order to respond correctly to the test item. Items 
were designed accordingly at three levels of the comprehension 
hierarchy--that is, items requiring information successively from one,
two, or three sentences of the stimulus passage in order to permit 
correct responding. Analyses were made of the comparative 
performances of the three item types. 

Results suggested that, while there was a slight tendency for 
subtests comprised of lower-order comprehension items to exhibit 
higher mean scores and higher reliability estimates than did subtests 
with higher-order items (Table 3), there was no consistent effect on 
item difficulty associated with level of comprehension processing 
hierarchy across the 144 items and 120 persons in the present study
(Tables 6A, 6B, and 6C). The one possible exception involves a
significant interaction effect between option length and processing 
level that was discussed earlier. 

There was a significant effect on item discriminability 
associated with comprehension processing level as defined (Table 7B).
This effect was such that items representing lower levels of the
comprehension hierarchy tended to discriminate better than items
representing higher levels. Also, there was a significant effect of
processing level on Rasch model fit detected in the ANOVA reported in
Table 8A (such that lower-order items demonstrated better fit to the
expectations of the model and, thus, greater response validity than
did higher-order items). This effect persisted when sequence and
encounter scores were used as concomitant variables in the ANCOVA
reported in Table 8B. It is also possible, since there were more
lower-order than higher-order items in the study, that fit to the
expectations of the model would entail greater conformity to the
response characteristics of the lower-order items and would thus bias
the fit statistic in favor of lower-order items. The multitrait-
multimethod analysis (Table 9) suggested that items of lower-order
processing level, unlike items of higher-order processing level,
exhibited discriminant validity in all required comparisons.

These results do not support the view that a concerted effort 
should be made to ensure that a preponderance of TOEFL listening 
comprehension items be designed at the higher levels of the 
comprehension processing hierarchy. To the contrary, it would appear 
from the multitrait-multimethod analysis conducted that increased 
reliance on so-called lower-order items as defined for the present 
study may result in a commensurate increase in construct validity of 
the tests. It must be cautioned, however, that the practice of 
defining levels of processing for listening comprehension assessment 
by means of measures of the amount of discourse needed to be processed 
in order to respond correctly is not the only way of defining the 
comprehension hierarchy. Nevertheless, results of application of the 
present procedure appeared to underscore the psychometric value of 
lower-order comprehension items in the same way that Alderson's (1986) 
study supported their use in the assessment of reading comprehension. 

It should also be noted here that the current TOEFL listening 
comprehension component includes a variety of item types, not all of 
which were systematically considered in the present study. The study 
was further limited by its primary focus on three identified concerns 
related to listening comprehension item format. Of the three major 
concerns investigated--memory load, reading response, and
comprehension hierarchy--results suggested that the one concern with
greatest merit toward the implementation of possible improvements in 
the format of TOEFL listening comprehension items is the concern 
related to the length of the reading response options. Several of the 
analyses indicated that reducing the length of the reading response 
options in a listening comprehension test such as TOEFL Listening 
Comprehension Section A by as much as one-half the current length 
could result in enhanced item and test quality in terms of a variety 
of established psychometric criteria. It is therefore recommended 
that appropriate consideration be given to the reduction of
response-option length in the development of future versions of the listening
component of the TOEFL test. 



Table 1
Sample Description

Native Language     N     Residence in English-    N     Amount of English     N
                          Speaking Country               Language Study

Arabic              3     0-6 Months               85    0-6 Months            9
[illegible]         1     6 Months to 1 Year       15    6 Months to 1 Year    7
[illegible]   [illegible] 1-2 Years                12    1-2 Years             18
[illegible]         11    2-3 Years                1     2-3 Years             11
German              17    3-5 Years                3     3-5 Years             15
[illegible]   [illegible] More Than 5 Years        3     More Than 5 Years     59
[illegible]         0     Not Reported             1     Not Reported          1
[illegible]         30
[illegible]         8
[illegible]   [illegible]
Polish              2
Portuguese          2
Spanish             13
Swahili             1
Thai                6
Turkish             1
Not Reported        2

Totals              120                            120                         120



Table 2
Design of Listening Comprehension Test Administration

                   Listening Test 1      Listening Test 2      Listening Test 3

Passages:          48 1-Sentence         24 2-Sentence         16 3-Sentence
                   Passages              Passages              Passages

Items:             48 Items              48 Items              48 Items
                   (1 per Passage)       (2 per Passage)       (3 per Passage)

Repetition:        First 24 Passages     First 12 Passages     First 8 Passages
                   Repeated Once         Repeated Once         Repeated Once
                   Second 24 Passages    Second 12 Passages    Second 8 Passages
                   Not Repeated          Not Repeated          Not Repeated

Length of          24 Short-Option       24 Short-Option       24 Short-Option
Response Option:   Items                 Items                 Items
                   24 Longer-Option      24 Longer-Option      24 Longer-Option
                   Items                 Items                 Items

Level of           48 First-Level        24 First-Level        16 First-Level
Comprehension      Items                 Items                 Items
Hierarchy:                               24 Second-Level       16 Second-Level
                                         Items                 Items
                                                               16 Third-Level
                                                               Items

Test sequence was counterbalanced across all subjects so that each test
was encountered in each of three sequences (i.e., 1-2-3, 2-3-1, or 3-1-2) by
the same number of persons.

Option length was randomly stratified within test across repetition and
hierarchy conditions so that the same number of short- and longer-option items
occurred under each condition.

Hierarchy was necessarily confounded with passage length as it was
defined by the number of stimulus sentences on which each answer depended.

Items at hierarchical levels were met sequentially (1, 1-2, or 1-2-3)
after each stimulus passage in accordance with passage length.

16 items appeared in all three tests, 24 items appeared in two of the
three tests, and 48 items appeared in no more than one test, in a balanced
manner so that each subject had 48 one-time-item encounters, 48 two-time-
item encounters, and 48 three-time-item encounters.

Each item was coded for sequence and number of encounters across the
three tests in every experimental condition to enable control for possible
contamination by these influences.

Every test and every item was encountered by every subject with equal
time allowed to each subject to respond.

Most of these design features are evident within the actual test forms
presented in the Appendix.
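
The encounter counts given in the design notes above can be checked
with simple arithmetic. The sketch below assumes, on one reading of
the notes, that the 144 items analyzed are item positions across the
three 48-item tests rather than 144 distinct items; the variable names
are hypothetical.

```python
# (number of distinct items, number of tests in which each appears)
item_groups = {
    "three_tests": (16, 3),
    "two_tests": (24, 2),
    "one_test": (48, 1),
}

# Encounters contributed by each group for a single examinee.
encounters = {name: n_items * n_tests
              for name, (n_items, n_tests) in item_groups.items()}

distinct_items = sum(n_items for n_items, _ in item_groups.values())
total_encounters = sum(encounters.values())

print(encounters)
print("distinct items:", distinct_items)
print("item positions:", total_encounters)
```

Each group contributes 48 encounters, and the total of 144 matches
three tests of 48 items each.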



Table 4
Descriptive Statistics for Item Variances
(N = 144 Items)

Variable                                  Mean         SD           [illegible]  Range

Difficulty:
  Proportion Correct                      [illegible]  [illegible]  [illegible]  [illegible]
  Rasch Difficulty (Subtest)              [illegible]  [illegible]  [illegible]  -3.380 to [illegible]

Discrimination:
  Item-Total Point Biserial r (Subtest)   [illegible]  [illegible]  [illegible]  -.108 to [illegible]
  Item-Total Point Biserial r (Total)     .332         .132         .091         -.076 to .575
  Item-Total Biserial r (Subtest)         .526         .239         .092         -.312 to .916
  Item-Total Biserial r (Total)           .458         .195         .092         -.220 to .746
  Item-Total Biserial r (Subtest)
    (Part-Whole Overlap Corrected)        .484         .238         .092         -.328 to .907
  Item-Total Biserial r (Total)
    (Part-Whole Overlap Corrected)        .412         .195         .092         -.238 to .723

Construct Validity:
  Rasch Model Infit (Subtest)             -.018        1.076                     -3.000 to 2.840
  TOEFL Reading Biserial r                .286         .147         .092         -.040 to .648
  Digital Memory Biserial r               .170         .135         .092         -.101 to .452

All correlation statistics employed Fisher Z transformations.

Subtest estimates were based on the respective 48-item listening
comprehension tests separately.

Total test estimates were based on the composite of the three 48-item
listening comprehension tests.



Table 6A
ANOVA with Rasch Difficulty as Dependent Variable (N = 144 Items)

Source           DF      MS       F        P

Repetition        1     3.719    3.436    0.066
Option Length     1     6.265    5.787    0.018*
Hierarchy         2     1.199    1.107    0.334
REPxLEN           1     3.635    3.358    0.069
REPxHIR           2     0.010    0.009    0.991
LENxHIR           2     4.657    4.302    0.015*
REPxLENxHIR       2     1.762    1.762    0.176
Error           132     1.083

(R = 0.391)

*p < 0.05


Table 6B
ANCOVA with Rasch Difficulty as Dependent Variable and with
Sequence and Encounters as Concomitant Variables
(N = 144 Items)

Source           DF      MS       F        P

Repetition        1     4.121    3.861    0.052
Option Length     1     6.902    6.467    0.012*
Hierarchy         2     2.907    2.724    0.069
Sequence          1     3.451    3.233    0.074
Encounters        1     0.893    0.837    0.362
REPxLEN           1     3.102    2.906    0.091
REPxHIR           2     0.081    0.076    0.927
LENxHIR           2     4.964    4.652    0.011*
REPxLENxHIR       2     2.321    2.174    0.118
Error           130     1.067

(R = 0.422)

*p < 0.05



Table 6C
Means and Standard Deviations of Rasch Item Difficulty
Estimates at All Levels of Significant Effects (N = 144 Items)

                      Level 1                Level 2                Level 3
Main Effect        N    Mean     SD       N    Mean     SD       N    Mean     SD

Repetition        72   -0.120   1.201    72    0.119   0.951
Option Length     72   -0.111   0.914    72    0.111   1.232
Hierarchy         88   -0.172   0.974    40    0.290   1.263    16    0.219   1.076
LENxHIR (L1)      46   -0.107   0.895    18   -0.054   1.072     8   -0.263   0.700
        (L2)      42   -0.243   1.059    22    0.571   1.359     8    0.701   1.208



Table 7A
ANOVA with Overlap-Corrected Biserial Item-Total Individual Test
Discrimination as Dependent Variable (N = 144 Items)

Source           DF      MS            F        P

Repetition        1     0.003         0.047    0.828
Option Length     1     0.147         2.405    0.123
Hierarchy         2     [illegible]   0.872    [illegible]
REPxLEN           1     0.002         0.025    0.873
REPxHIR           2     0.008         0.126    0.882
LENxHIR           2     0.038         0.614    0.543
REPxLENxHIR       2     0.018         0.295    0.745
Error           132     0.061

(R = 0.215)


Table 7B
ANCOVA with Overlap-Corrected Biserial Item-Total Individual Test
Discrimination as Dependent Variable and with Sequence and Encounters
as Concomitant Variables (N = 144 Items)

Source           DF      MS       F        P

Repetition        1     0.022    0.387    0.535
Option Length     1     0.185    3.225    0.075
Hierarchy         2     0.196    3.404    0.036*
Sequence          1     0.531    9.239    0.003**
Encounters        1     0.102    1.766    0.186
REPxLEN           1     0.000    0.004    0.951
REPxHIR           2     0.003    0.059    0.943
LENxHIR           2     0.046    0.796    0.453
REPxLENxHIR       2     0.007    0.127    0.881
Error           130     0.057

(R = 0.344)

*p < 0.05
**p < 0.01



Table 8A
ANOVA with Rasch Model Fit as Dependent Variable
(N = 144 Items)

Source           DF      MS       F        P

Repetition        1     2.156    1.946    0.165
Option Length     1     4.746    4.284    0.040*
Hierarchy         2     3.517    3.174    0.045*
REPxLEN           1     0.001    0.001    0.971
REPxHIR           2     0.251    0.227    0.797
LENxHIR           2     2.410    2.175    0.118
REPxLENxHIR       2     0.629    0.568    0.568
Error           132     1.108

(R = 0.342)

*p < 0.05



Table 8B 

ANCOVA with Rasch Model Fit as Dependent Variable and with
Sequence and Encounters as Concomitant Variables (N = 144 Items)

Source             DF      MS        F        p

Repetition          1     1.858    1.672    0.198
Option Length       1     5.188    4.669    0.033*
Hierarchy           2     4.251    3.825    0.024*
Sequence            1     0.901    0.811    0.370
Encounters          1     0.982    0.884    0.349
REPxLEN             1     0.014    0.013    0.910
REPxHIR             2     0.180    0.162    0.851
LENxHIR             2     2.440    2.195    0.115
REPxLENxHIR         2     0.537    0.484    0.618
Error             130     1.111

(R = 0.357)



*p < 0.05 
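
As a reading aid for the tables above, the F column can be checked against the MS column: in a fixed-effects ANOVA or ANCOVA table of this kind, each F is the source's mean square divided by the Error mean square. The sketch below is illustrative only and is not part of the original analysis; the function name f_ratio is hypothetical, and the numbers are transcribed from Table 8B. Because the printed MS values are rounded to three decimals, recomputed ratios may differ from the printed F values in the last digit.

```python
# Illustrative check (not from the original report): F = MS_effect / MS_error.
# Values transcribed from Table 8B; the Error row supplies the denominator.
MS_ERROR = 1.111

mean_squares = {
    "Repetition": 1.858,
    "Option Length": 5.188,
    "Hierarchy": 4.251,
    "Sequence": 0.901,
    "Encounters": 0.982,
}


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F statistic for one source given its mean square and the error mean square."""
    return ms_effect / ms_error


if __name__ == "__main__":
    for source, ms in mean_squares.items():
        # Printed to three decimals for comparison with the table's F column.
        print(f"{source:<14} F = {f_ratio(ms, MS_ERROR):.3f}")
```

The same division applies to Tables 7B and 8A with their own Error mean squares (0.057 and 1.108).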
QUESTIONNAIRE



NAME: 



NATIVE LANGUAGE: 



How long have you lived in the United States or in any other 
English-speaking country? 

0-6 months 6 months to 1 year 

1-2 years 2-3 years 

3-5 years More than 5 years 



How long have you studied English? 

0-6 months 6 months to 1 year 

1-2 years 2-3 years 

3-5 years More than 5 years 
SCRIPT - 1

(Allow 12 seconds between items. Repeat 1-24. Do not repeat 25-48.) 

1. You don't have to tell me if you don't feel like it. 

2. Jane was asked to take one of the parts in the school play. 

3. Whatever the consequences, I'm ready to try it. 

4. Cindy had the shoemaker sharpen her ice skates. 

5. He placed his chair so that he could see out the window. 

6. The gas tank is empty. 

7. Across the street is a park where we can eat our lunch. 

8. We hardly studied at all last weekend. 

9. Angela hopes to attend business school in the fall. 

10. Why don't we move the chairs inside? 

11. How boring this homework is!

12. He himself didn't know what to do. 

13. His art was appreciated by the younger people at the exhibit. 

14. Sam measured the flour, sugar, and spices and then mixed in the eggs. 

15. He says he told the truth, but I don't believe him. 

16. He doesn't teach in this department. 

17. Mr. Hubbard served as chairman of the department until his retirement 
last year. 

18. I found that poem hard to understand, didn't you? 

19. I'll have to take this coat to the dry cleaner. 

20. If your plane reservations aren't confirmed forty-eight hours in advance, 
they may be canceled. 

21. I'm going to help Theresa with her math this afternoon. 

22. Julie had better go to the supermarket right away because her sister is 
coming for lunch. 

23. It seems as though we've known each other for a long time instead of just
two weeks.

24. The motorcycle costs too much, don't you agree?

25. The outdoor concert was called off due to the weather. 

26. In the basement I've discovered a defective heating unit that needs 
fixing. 

27. Kate was really feeling down in the dumps about her latest chemistry
assignment.

28. The person to see about housing is the dean of students. 

29. There ought to be more pencils than those left in the box. 

30. Can you read the signpost from here? 

31. After the speech came a brief question-and-answer session.

32. She's been through a lot lately. 

33. You can expect to spend at least an hour on this reading assignment. 

34. Only Toby went to the movie. 

35. Sue swims a mile every day to keep in shape.

36. Jeremy does his homework in the library with Sue. 

37. That isn't all I want. 

38. I wish I had photocopied that article so that I could refer to it now. 

39. I bought this coat when I was abroad. 

40. This trip'll be shorter on the subway than on the bus.

41. This television program is not in the least boring. 

42. By the time we get to the airport, the plane will have taken off. 

43. Whoever wins this game gets to play against Molly in the finals. 

44. To accuse him of all people! 

45. Nobody likes grapes more than I do. 

46. The high winds resulted in heavy damage to trees and power lines. 

47. Only Bill could draw a sketch like that. 

48. Dick's parents made him spend his vacation at home. 
NAME:



LISTENING COMPREHENSION - 1 



Directions: For each question in this part you will hear a short sentence. Each
sentence will be spoken one or two times. The sentences you hear will not be
written out for you. Therefore, you must listen carefully to understand what the
speaker says.

After you hear a sentence, read the four choices in your test book, marked (A),
(B), (C), and (D), and decide which one is closest in meaning to the sentence you
heard. Then, mark your answer on the test paper.

Example I 

You will hear: 

You will read: (A) Mary outswam the others. 

(B) Mary ought to swim with them. 

(C) Mary and her friends swam to the island. 

(D) Mary's friends owned the island. 

The speaker said, "Mary swam out to the island with her friends." Sentence (C),
"Mary and her friends swam to the island," is closest in meaning to the sentence
you heard. Therefore, you should choose answer (C).

Example II 

You will hear: 

You will read: (A) Please remind me to read this book. 

(B) Could you help me carry these books? 

(C) I don't mind if you help me. 

(D) Do you have a heavy course load this term? 

The speaker said, "Would you mind helping me with this load of books?" Sentence
(B), "Could you help me carry these books?" is the closest in meaning to the
sentence you heard. Therefore, you should choose answer (B).



You. . . 

(A) might tell me. 

(B) should tell me. 

(C) shouldn't tell me. 

(D) needn't tell me. 



8. We studied. . . 

(A) all last weekend. 

(B) at last. 

(C) all the material. 

(D) very little. 



(A) Jane asked if she could be in 
the school play. 

(B) Jane took part of my lunch
today.

(C) Jane is very involved in her 
schoolwork. 

(D) Jane was offered a role in the 
play. 



(A) Angela wants to begin
business school this
autumn.

(B) Until she fell, Angela
had been planning to go
to school.

(C) Angela plans to attend to 
her business at school. 

(D) Attendance at Angela's 
school has declined. 



3. The consequences...

(A) are already known.

(B) won't stop me.

(C) are known by trial.

(D) won't ever change.



4. The shoemaker...

(A) thought Cindy was nice.

(B) sharpened Cindy's skates.

(C) shined Cindy's shoes.

(D) gave Cindy rice cakes.



He moved. . . 

(A) in his chair. 

(B) from the window. 

(C) to see better. 

(D) under his place. 



6. (A) The tank is broken. 

(B) There's no gas left. 

(C) The gas is no good. 

(D) Thanks for the gas. 



10. (A) Aren't the chairs inside? 

(B) We don't know which
chairs to move.

(C) I think we should take the 
chairs in. 

(D) Why do you want to move 
the chairs? 



11. This homework is... 

(A) less boring. 

(B) interesting, isn't it? 

(C) not boring, is it? 

(D) very uninteresting.



12. He...

(A) didn't know either.

(B) knew what to do.

(C) didn't do it.

(D) told what he knew.



7. Let's. . . 

(A) part across the street. 

(B) eat in the park. 

(C) pack a lunch. 

(D) cross on a hunch. 



(A) No one appreciated his art. 

(B) The artist did not care for the 
people at the exhibit. 

(C) The artist enjoyed having only 
young people at the exhibit. 

(D) The younger people liked his 
art. 



(A) Sam was doing a chemistry 
experiment. 

(B) Sam was baking. 

(C) Sam became confused. 

(D) Sam measured up to
expectations.



(A) He told me not to believe it. 

(B) He thinks I don't tell the
truth.

(C) I think his story is false. 

(D) I don't believe he lied. 



He. . . 

(A) doesn't teach well. 

(B) teaches elsewhere.

(C) teaches history. 

(D) doesn't like teaching. 



(A) Mr. Hubbard served the
chairman.

(B) Mr. Hubbard replaced the last 
chairman. 

(C) Mr. Hubbard is no longer the 
chairman. 

(D) Mr. Hubbard was manager of the
apartment.

(A) It's a difficult poem, isn't it? 

(B) Didn't you find the poem we were 
assigned to read? 

(C) Wasn't it hard to stand there 
and recite that poem? 

(D) You lost the poem, didn't you? 
19. (A) It's the only coat I have.

(B) I can take this coat for 
you. 

(C) The dry cleaner has my 
coat. 

(D) My coat needs cleaning. 



20. Plane reservations should be... 

(A) made in advance. 

(B) confirmed.

(C) canceled. 

(D) canceled after 48 hours. 



21. Theresa. . . 

(A) helps with math. 

(B) goes this afternoon. 

(C) will get help in math. 

(D) teaches in the afternoon. 



22. (A) Julie had lunch at the 
grocery store. 

(B) Julie needs to buy some 
food quickly. 

(C) Julie must write to her 
sister immediately. 

(D) Julie got a better mark on 
the test than her sister. 



23. (A) We've been friends for a
long time.

(B) We haven't seen each other 
in a while. 

(C) We met only two weeks ago. 

(D) We hardly know anything 
about each other. 



24. (A) How much did you agree to 
pay for the motorcycle? 

(B) Don't you think that the 
motorcycle is too 
expensive? 

(C) I don't agree with you
about the cost of the
motorcycle.

(D) I think you should agree
to buy the motorcycle.
25. The concert...

(A) was overdue. 

(B) was called together. 

(C) rained out. 



26. The basement...

(A) has a detective unit.

(B) heater needs repair.

(C) is well covered.

(D) prices need fixing.



27. Kate. . . 

(A) taught chemistry classes. 

(B) dumped chemical wastes. 

(C) was down an assignment.

(D) disliked her assignment.



28. The dean of students... 

(A) has someone to see. 

(B) houses students. 

(C) sees to housing. 

(D) prevents carousing. 



29. Some pencils... 

(A) fell in the box.

(B) must be missing. 

(C) cost too much. 

(D) ought to be left. 



30. (A) Isn't the post office near here? 

(B) Where's the letter to be signed? 

(C) Isn't that a side street? 

(D) Can you see what that sign says? 



32. (A) She's done a great deal 
of traveling. 

(B) There's a good reason
she's late.

(C) She's just finished her 
share of the work. 

(D) Things have been difficult 
for her recently. 



33. You will. . . 

(A) expect an assignment. 

(B) spend much for this. 

(C) read less than an hour. 

(D) need an hour or more. 



34. (A) Toby went to the movie
alone.

(B) Toby has only gone to the
movie.

(C) Toby went to one movie.

(D) If only Toby would go to
the movie!



35. Sue...

(A) is mild mannered.

(B) smiles every day.

(C) keeps escaping.

(D) swims daily.



36. Jeremy...

(A) works at home. 

(B) studies with Sue. 

(C) left the library. 

(D) doesn't know Sue. 



31. (A) Only the brief questions were
answered.

(B) Someone was asked to give a
speech.

(C) The speaker spent a short while 
answering questions. 

(D) Someone arrived after the 
speech to ask some questions. 



37. (A) That's just part of what I
want.

(B) I don't want that at all. 

(C) I want this rather than 
that. 

(D) Other people want that, 
but I don't. 
I should have...

(A) studied photography. 

(B) made a copy. 

(C) returned the article.

(D) preferred photocopy. 



44. (A) No one accused him. 

(B) They didn't act on his
accusation.

(C) He accused all of them. 

(D) I can't believe they 
accused him. 



(A) I took this coat abroad with me.

(B) This coat is too big for me now.

(C) I purchased this coat while out
of the country.

(D) This coat is very broad in the
shoulders.



(A) Taking the subway would get us 
there faster. 

(B) The bus and subway take the 
same amount of time. 

(C) Going by bus would take less 
time. 

(D) The bus goes past the subway 
station. 



This program is...

(A) endlessly boring.

(B) quite interesting.

(C) the least boring.

(D) the least interesting.



We'll ... 

(A) leave the airport. 

(B) miss the plane. 

(C) take time off. 

(D) get the fare. 



45. (A) I like grapes better than
anyone does.

(B) I grow more grapes than
anyone else.

(C) Grapes are more nutritious 
than I thought. 

(D) Very few people like 
grapes. 



46. (A) The strong winds broke
tree limbs.

(B) The high winds and heavy
seas made us feel
helpless.

(C) The wind caused heavy
flooding and drainage
problems.

(D) The power lines damaged
some big trees.



47. (A) No one else could draw
such a picture.

(B) No one else was allowed to
sketch.

(C) Bill drew only one sketch.

(D) Bill draws only what he
likes.



The winner . . . 

(A) finally played. 

(B) plays Molly. 

(C) is finalized. 
(D) is Molly.



48. Dick. . . 

(A) loved his parents. 

(B) spent his money. 

(C) vacationed at home. 

(D) left his home. 
SCRIPT - 2



(Allow 20 seconds between items. Repeat passages 1-12. Do not repeat
passages 13-24.)

Items

1 to 2 1. You don't have to tell me if you don't feel like it.

You're welcome to keep it a secret if you wish. 

3 to 4 2. Jane was asked to take one of the parts in the school play. 

She has performed well in school productions since she was a 
child. 

5 to 6 3. Whatever the consequences, I'm ready to try it. 

It sounds like an exciting thing to do. 

7 to 8 4. Cindy had the shoemaker sharpen her ice skates. 

She was getting ready for the race. 

9 to 10 5. He placed his chair so that he could see out the window. 

He always enjoyed the view of the valley in the springtime. 

11 to 12 6. The gas tank is empty.

You'd better stop soon. 

13 to 14 7. Across the street is a park where we can eat our lunch. 

There are lots of picnic tables and it is usually quiet. 

15 to 16 8. We hardly studied at all last weekend. 

Our family came for a short visit. 

17 to 18 9. Angela hopes to attend business school in the fall. 

She believes she wants to be an accountant. 

19 to 20 10. Why don't we move the chairs inside? 

It's cold here when the wind blows. 

21 to 22 11. How boring this homework is! 

Anything else is more interesting. 

23 to 24 12. He himself didn't know what to do. 

But he pretended to know the answers. 

25 to 26 13. His art was appreciated by the younger people at the exhibit. 

But the older people were sure that he had no talent. 

27 to 28 14. Sam measured the flour, sugar, and spices and then mixed in the 
eggs. 

Then he stirred it for several minutes and put it in the oven. 

29 to 30 15. He says he told the truth, but I don't believe him. 

He has a history of stretching the facts to suit himself. 
31 to 32 16. He doesn't teach in this department.

Maybe you should ask next door.



33 to 34 17. Mr. Hubbard served as chairman of the department until his
retirement last year.

His efforts to hire outstanding new faculty members will always 
be appreciated. 

35 to 36 18. I found that poem hard to understand, didn't you? 

I couldn't even decide what the main idea was. 



37 to 38 19. I'll have to take this coat to the dry cleaner.

It's got food stains on the collar and both sleeves.

39 to 40 20. If your plane reservations aren't confirmed forty-eight hours
in advance, they may be canceled.

So you'd better get on the phone in the office across the hall.

41 to 42 21. I'm going to help Theresa with her math this afternoon.

She's having trouble with long division and fractions.

43 to 44 22. Julie had better go to the supermarket right away because her
sister is coming for lunch.

There's no milk or sandwich bread anywhere in the house.

45 to 46 23. It seems as though we've known each other for a long time
instead of just two weeks.

It must be because we have so much in common and agree about so
many things.

47 to 48 24. The motorcycle costs too much, don't you agree?

We could find a car for that price.
LISTENING COMPREHENSION - 2



Directions: For each question in this part you will hear a short
passage. Each passage will be spoken one or two times. The sentences
you hear will not be written out for you. Therefore, you must listen
carefully to understand what the speaker says.

After you hear a passage, read the four choices, marked (A), (B), (C),
and (D), for each question, and decide which one is closest in meaning
to the passage you heard. Then, mark your answer on the test paper.
Answer two questions after each passage.
You. . .

(A) might tell me. 

(B) should tell me. 

(C) shouldn't tell me.

(D) needn't tell me. 



7. The shoemaker... 

(A) thought Cindy was nice. 

(B) sharpened Cindy's skates. 

(C) shined Cindy's shoes. 

(D) gave Cindy rice cakes. 



(A) You're thinking about telling
me.

(B) You like to keep secrets from
me.

(C) You don't feel welcome here.

(D) You secretly wish to feel
welcome.

***** 

(A) Jane asked if she could be in 
the school play. 

(B) Jane took part of my lunch
today.

(C) Jane is very involved in her 
schoolwork. 

(D) Jane was offered a role in the 
play. 



(A) Jane is a child who only wants 
to play. 

(B) Jane always takes part in
school.

(C) Jane was a good choice since she 
does well. 

(D) To perform well takes much
practice.

***** 

The consequences . . . 

(A) are already known. 

(B) won't stop me.

(C) are known by trial. 

(D) won't ever change. 



(A) I tried it already and liked it. 

(B) I like whatever happens to me. 

(C) The consequences will be very
nice.

(D) It is so much fun that it is 
worth any trouble. 



(A) Shoemakers are usually
ready.

(B) Racing requires sharp
skates.

(C) Cakes must be prepared.

(D) Cindy fell on her face.

*****

He moved. . .

(A) in his chair.

(B) from the window.

(C) to see better.

(D) under his place.



10. (A) He called a rally in the
springtime.

(B) The view could be enjoyed
after moving.

(C) He could see his chair
from the window.

(D) Springtime was a difficult
time.

*****

11. (A) The tank is broken.

(B) There's no gas left.

(C) The gas is no good.

(D) Thanks for the gas.



12. (A) Stop emptying the gas from
the tank.

(B) It's better to stop giving
thanks.

(C) Empty the gas tank
quickly.

(D) You need to get gas right
away.

***** 



***** 
13. Let's. . .

(A) part across the street. 

(B) eat in the park.

(C) pack a lunch. 

(D) cross on a hunch. 



20. Let's ... 

(A) move out of the wind. 

(B) not move inside.

(C) get new chairs. 

(D) chair that motion. 



14. There's...

(A) a good place to eat. 

(B) some lunch in the park. 

(C) quite a stable. 

(D) parking in the street. 

***** 



***** 

21. This homework is... 

(A) less boring. 

(B) interesting, isn't it? 

(C) not boring, is it? 

(D) very uninteresting. 



15. We studied. . . 

(A) all last weekend. 

(B) at last. 

(C) all the material. 

(D) very little. 

16. (A) The weekend seemed entirely too
short.

(B) Our studies lasted all weekend.

(C) A family visit interrupted our
studies.

(D) We studied hard for our family. 

***** 

17. (A) Angela wants to begin business
school this autumn.

(B) Until she fell, Angela had been 
planning to go to school. 

(C) Angela plans to attend to her 
business at school. 

(D) Attendance at Angela's school 
has declined. 

18. Angela wants...

(A) No business at all. 

(B) School to end. 

(C) An accounting course. 

(D) To count her school. 



22. (A) I'd prefer anything to
this homework.

(B) I'm sorry to be so boring. 

(C) My homework is more 
interesting. 

(D) I like work more than 
anything. 

***** 

23. He. . . 

(A) didn't know either. 

(B) knew what to do. 

(C) didn't do it. 

(D) told what he knew. 



24. (A) He tended to know every
answer.

(B) He gave a false impression
of his knowledge.

(C) He knew more than he was
willing to show.

(D) He didn't know where he
had learned the answers.

***** 



*****



19. (A) Aren't the chairs inside? 

(B) We don't know which chairs to
move.

(C) I think we should take the 
chairs in. 

(D) Why do you want to move the 
chairs? 
25. (A) No one appreciated his art.

(B) The artist did not care for the 
people at the exhibit. 

(C) The artist enjoyed having only 
young people at the exhibit. 

(D) The younger people liked his 
art. 



26. Not everyone ... 

(A) likes art. 

(B) has talent. 

(C) appreciates young people. 

(D) has the same taste. 

***** 

27. (A) Sam was doing a chemistry
experiment.

(B) Sam was baking. 

(C) Sam became confused. 

(D) Sam measured up to
expectations.



28. (A) Sam seldom stirred when he was
busy.

(B) Sam followed a plan in his
cooking.

(C) Sam tried to bake one dozen. 

(D) Sam was mixed up about what he 
was doing. 

***** 

29. (A) He told me not to believe it.

(B) He thinks I don't tell the
truth.

(C) I think his story is false.

(D) I don't believe he lied. 



30. He...

(A) is often untruthful.

(B) says that suit stretched.

(C) has an unbelievable history.

(D) told the truth.

*****

31. He...

(A) doesn't teach well.

(B) teaches elsewhere.

(C) teaches history.

(D) doesn't like teaching.



32. He...

(A) might teach nearby.

(B) asked the department.

(C) came next door.

(D) doesn't teach as asked.

*****

33. (A) Mr. Hubbard served the
chairman.

(B) Mr. Hubbard replaced the
last chairman.

(C) Mr. Hubbard is no longer
the chairman.

(D) Mr. Hubbard was manager of
the apartment.



34. He will be...

(A) chairman of the
department.

(B) retiring next year.

(C) a new faculty member.

(D) remembered for
recruitment.

*****

35. (A) It's a difficult poem,
isn't it?

(B) Didn't you find the poem
we were assigned to read?

(C) Wasn't it hard to stand
there and recite that
poem?

(D) You lost the poem, didn't
you?



36. I...



(A) have an Idea for a poem. 

(B) decided, didn't you? 

(C) had trouble understanding 
it. 

(D) found that poem. 
37. (A) It's the only coat I have.

(B) I can take this coat for you. 

(C) The dry cleaner has my coat. 

(D) My coat needs cleaning. 

38. The cleaner will... 

(A) remove stains. 

(B) sew the sleeves. 

(C) take great pains. 

(D) press the collar. 

***** 

39. Plane reservations should be... 

(A) made in advance.

(B) confirmed. 

(C) canceled. 

(D) canceled after 48 hours. 



40. (A) They confirm reservations in
that office.

(B) There's someone on the phone in
that office.

(C) Use the phone to confirm your
plane.

(D) They canceled 48 planes by
phone.

***** 

41. Theresa...

(A) helps with math. 

(B) goes this afternoon. 

(C) will get help in math. 

(D) teaches in the afternoon. 



44. Julie should...

(A) get bread for her sister.

(B) come home for lunch.

(C) stay in the house. 

(D) use up the milk. 

***** 

45. (A) We've been friends for a
long time.

(B) We haven't seen each other 
in a while. 

(C) We met only two weeks ago. 

(D) We hardly know anything 
about each other. 



46. We. . . 

(A) get along well with
others.

(B) commonly disagree.

(C) have recently become
friends.

(D) stayed a long time. 

***** 

47. (A) How much did you agree to
pay for the motorcycle?

(B) Don't you think that the
motorcycle is too
expensive?

(C) I don't agree with you
about the cost of the
motorcycle.

(D) I think you should agree 
to buy the motorcycle. 



42. Theresa needs... 

(A) a fraction of your time. 

(B) a new math book. 

(C) my help with division. 

(D) tomorrow afternoon. 

***** 

43. (A) Julie had lunch at the grocery
store.

(B) Julie needs to buy some food
quickly.

(C) Julie must write to her sister 
immediately. 

(D) Julie got a better mark on the 
test than her sister. 



48. (A) We might as well buy a
car.

(B) We don't agree about the
price.

(C) Let's buy a better
motorcycle.

(D) Cars cost too much money
anyway.

***** 
Script - 3



(Allow 30 seconds between items. Repeat passages 1-8. Do not repeat passages 
9-16.) 

Items 

1 to 3 1. You don't have to tell me if you don't feel like it. 

You're welcome to keep it a secret if you wish. 
But I have some news you might like to hear too. 

4 to 6 2. Jane was asked to take one of the parts in the school play. 

She has performed well in school productions since she was a child. 
And I think she wants to be a high school drama teacher some day. 

7 to 9 3. Whatever the consequences, I'm ready to try it. 
It sounds like an exciting thing to do. 
And we've had no entertainment for weeks. 

10 to 12 4. Cindy had the shoemaker sharpen her ice skates. 
She was getting ready for the race. 
And she wanted to have every possible advantage. 

13 to 15 5. He placed his chair so that he could see out the window. 

He always enjoyed the view of the valley in the springtime. 
It reminded him of his youth and the freedom he loved. 

16 to 18 6. The gas tank is empty. 

You'd better stop soon. 
There's a station up ahead. 

19 to 21 7. Across the street is a park where we can eat our lunch. 

There are lots of picnic tables and it is usually quiet. 
We can discuss that private matter and not be overheard. 

22 to 24 8. We hardly studied at all last weekend. 

Our family came for a short visit. 
We went for a drive and talked. 

25 to 27 9. Angela hopes to attend business school in the fall. 
She believes she wants to be an accountant. 
It will be an opportunity for her to use her math skills. 

28 to 30 10. Why don't we move the chairs inside? 

It's cold here when the wind blows. 
And I'm still getting over a cold. 

31 to 33 11. How boring this homework is! 

Anything else is more interesting. 
I can't wait till this term is over.
34 to 36 12. He himself didn't know what to do.

But he pretended to know the answers. 
And chided us for responding slowly. 

37 to 39 13. His art was appreciated by the younger people at the exhibit. 
But the older people were sure that he had no talent. 
No art form has ever been uniformly liked by everyone. 

40 to 42 14. Sam measured the flour, sugar, and spices and then mixed in the eggs. 

Then he stirred it for several minutes and put it in the oven. 
I waited impatiently as the cake was baking, hoping for a taste. 

43 to 45 15. He says he told the truth, but I don't believe him. 

He has a history of stretching the facts to suit himself. 
And I can never know whether he's serious or just kidding. 

46 to 48 16. He doesn't teach in this department. 

Maybe you should ask next door. 
They have more teachers than we do. 
LISTENING COMPREHENSION - 3



Directions: For each question in this part you will hear a short
passage. Each passage will be spoken one or two times. The sentences
you hear will not be written out for you. Therefore, you must listen
carefully to understand what the speaker says.

After you hear a passage, read the four choices, marked (A), (B), (C),
and (D), for each question, and decide which one is closest in meaning
to the passage you heard. Then, mark your answer on the test paper.
Answer three questions after each passage. 
1. You. . .

(A) might tell me. 

(B) should tell me. 

(C) shouldn't tell me. 

(D) needn't tell me. 



The consequences. . . 

(A) are already known. 

(B) won't stop me. 

(C) are known by trial. 

(D) won't ever change. 



(A) You're thinking about telling
me.

(B) You like to keep secrets from
me.

(C) You don't feel welcome here.

(D) You secretly wish to feel
welcome.

(A) You'd better not tell the 
secret. 

(B) You're welcome to hear the news. 

(C) I feel happy about whatever you
decide.

(D) If you don't share, I won't 
either. 

***** 

(A) Jane asked if she could be in 
the school play. 

(B) Jane took part of my lunch
today.

(C) Jane is very involved in her
schoolwork.

(D) Jane was offered a role in the 
play. 



8. (A) I tried it already and
liked it.

(B) I like whatever happens to
me.

(C) The consequences will be
very nice.

(D) It is so much fun that it
is worth any trouble.



(A) Our lack of amusement
makes me try strange
things.

(B) We won't know the
consequences for weeks.

(C) We can't try this new
experiment.

(D) It's fun no matter how it
sounds.



***** 



10. The shoemaker...

(A) thought Cindy was nice. 

(B) sharpened Cindy's skates. 

(C) shined Cindy's shoes. 

(D) gave Cindy rice cakes. 



(A) Jane is a child who only wants 
to play. 

(B) Jane always takes part in
school.

(C) Jane was a good choice since she
does well.

(D) To perform well takes much
practice.

Jane's participation... 

(A) was quite a production. 

(B) will help her career. 

(C) was child's play. 

(D) will cost a lot. 

***** 



11. (A) Shoemakers are usually
ready.

(B) Racing requires sharp
skates.

(C) Cakes must be prepared. 

(D) Cindy fell on her face. 



12. (A) The shoemaker was ready 
for the race. 

(B) It's possible to have your 
cake and eat it too. 

(C) It was a racing advantage
to have sharpened skates.

(D) Cindy fell again at the 
same place. 
13. He moved. . .

(A) in his chair. 

(B) from the window. 

(C) to see better. 

(D) under his place. 



19. Let's. . . 

(A) part across the street. 

(B) eat in the park.

(C) pack a lunch. 

(D) cross on a hunch. 



14. (A) He called a rally in the
springtime.

(B) The view could be enjoyed after
moving.

(C) He could see his chair from the
window.

(D) Springtime was a difficult time. 



15. He moved his chair... 

(A) To get out of the sun. 

(B) Because he fixed the springs. 

(C) To get out of view from the
window.

(D) For memories brought by the 
view. 

***** 



16. (A) The tank is broken. 

(B) There's no gas left. 

(C) The gas is no good. 

(D) Thanks for the gas. 



17. (A) Stop emptying the gas from the 
tank. 

(B) It's better to stop giving 
thanks . 

(C) Empty the gas tank quickly. 

(D) You need to get gas right away. 



20. There's . . . 

(A) a good place to eat. 

(B) some lunch in the park. 

(C) quite a stable. 

(D) parking in the street. 



21. In the park. . . 

(A) people leave their cars. 

(B) we can eat and talk alone. 

(C) it matters to be 
overheard. 

(D) our friends have a picnic. 

***** 

22. We studied. . . 

(A) all last weekend. 

(B) at last. 

(C) all the material. 

(D) very little. 



23. (A) The weekend seemed
entirely too short. 

(B) Our studies lasted all 
weekend. 

(C) A family visit interrupted 
our studies. 

(D) We studied hard for our
family.



18. You should stop...

(A) to empty the gas tank.

(B) to say thanks for gas.

(C) at the station for gas. 

(D) quickly for directions. 



*****



24. (A) We studied all last
weekend in spite of the
visit.

(B) We had a short visit to 
study hall. 

(C) Our family studied last 
weekend. 

(D) Driving and talking with
family took time from 
study. 

***** 
25. (A) Angela wants to begin
business school this autumn.

(B) Until she fell, Angela had been 
planning to go to school. 

(C) Angela plans to attend to her 
business at school. 

(D) Attendance at Angela's school 
has declined. 



26. Angela wants ... 

(A) no business at all. 

(B) school to end. 

(C) an accounting course. 

(D) to count her school. 



27. (A) Angela hopes to finish her
business in fall.

(B) Angela's math skill is suited to 
accounting study. 

(C) Angela finished school last 
fall. 

(D) Angela counted on no new 
opportunities.

***** 

28. (A) Aren't the chairs inside? 

(B) We don't know which chairs to 
move.

(C) I think we should take the 
chairs in. 

(D) Why do you want to move the 
chairs? 

29. Let's . . . 

(A) move out of the wind. 

(B) not move inside. 

(C) get new chairs. 

(D) chair that motion. 



30. Moving the chairs... 

(A) will keep us outside. 

(B) is done by the wind. 

(C) will help my cold. 

(D) is too cold to do. 

***** 



31. This homework is... 

(A) less boring. 

(B) interesting, isn't it? 

(C) not boring, is it? 

(D) very uninteresting. 



32. (A) I'd prefer anything to 
this homework. 

(B) I'm sorry to be so boring. 

(C) My homework is more 
interesting. 

(D) I like work more than 
anything.



33. My boredom . . .

(A) is unexplainable. 

(B) is better than work. 

(C) is from not enough 
homework. 

(D) will end after this term. 



***** 



34. He . . .

(A) didn't know either. 

(B) knew what to do. 

(C) didn't do it. 

(D) told what he knew. 



35. (A) He tended to know every 
answer.

(B) He gave a false impression 
of his knowledge.

(C) He knew more than he was 
willing to show. 

(D) He didn't know where he 
had learned the answers. 



36. (A) He slowly tended to learn 
what to do.

(B) His answers were two- 
sided. 

(C) He criticized our 
responses, but he knew
less.

(D) He didn't know how to 
respond slowly. 
(A) No one appreciated his art.

(B) The artist did not care for the 
people at the exhibit. 

(C) The artist enjoyed having only 
young people at the exhibit. 

(D) The younger people liked his 
art. 



43. (A) He told me not to believe 
it. 

(B) He thinks I don't tell the 
truth.

(C) I think his story is 
false.

(D) I don't believe he lied. 



Not everyone . . . 

(A) likes art. 

(B) has talent. 

(C) appreciates young people. 

(D) has the same taste. 



Art forms . . . 

(A) are not liked by youth. 

(B) are sure to require talent. 

(C) are not equally appreciated. 

(D) are exhibited by older people. 

***** 



(A) Sam was doing a chemistry 
experiment.

(B) Sam was baking. 

(C) Sam became confused. 

(D) Sam measured up to 
expectations.



(A) Sam seldom stirred when he was 
busy. 

(B) Sam followed a plan in his
cooking. 

(C) Sam tried to bake one dozen. 

(D) Sam was mixed up about what he 
was doing. 

Sam's work . . .

(A) made me hungry. 

(B) mixed me up. 

(C) tired me out. 

(D) stirred us all. 

***** 
44. He . . .

(A) is often untruthful. 

(B) says that suit stretched. 

(C) has an unbelievable 
history. 

(D) told the truth. 



45. (A) His manner of speaking
makes him hard to believe.

(B) He is usually far too 
serious.

(C) He tells the truth no 
matter what happens.

(D) His kidding is a source of 
enjoyment.

***** 



46. He . . .

(A) doesn't teach well. 

(B) teaches elsewhere. 

(C) teaches history. 

(D) doesn't like teaching. 

47. He ... 

(A) might teach nearby. 

(B) asked the department. 

(C) came next door. 

(D) doesn't teach as asked. 



48. (A) He has already finished 
his teaching. 

(B) This department has 
offices next door. 

(C) Since he doesn't work 
here, perhaps he is in a
larger department.

(D) Maybe he is not a teacher 
after all. 

***** 
DIGITAL MEMORY TEST

DIRECTIONS (presented orally by recording): Listen carefully to the following
numbers. Try to remember as many numbers as you can. Do not write anything 
until you are told. After listening, you will have two minutes to write as 
many of these numbers as you can remember. Now listen carefully. 

4 

39 

5 

44 

14 

84 

1 

8 

43 
92 
29 
12 
59 
48 
50 

Now write as many numbers as you remember. 